Dataset statistics
| Number of variables | 18 |
|---|---|
| Number of observations | 1017209 |
| Missing cells | 2173431 |
| Missing cells (%) | 11.9% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 139.7 MiB |
| Average record size in memory | 144.0 B |
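The derived figures in this table follow from the raw counts. A minimal sketch, assuming the missing-cell percentage is taken over all cells (observations × variables) and that 1 MiB is 1,048,576 bytes; the variable names are illustrative and not part of the report. The per-variable missing counts are the ones listed in the variable sections of this report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 18
n_observations = 1_017_209
missing_cells = 2_173_431
total_size_mib = 139.7

# Missing counts per variable, as reported in the individual variable sections.
missing_by_variable = {
    "CompetitionDistance": 2_642,
    "CompetitionOpenSinceMonth": 323_348,
    "CompetitionOpenSinceYear": 323_348,
    "Promo2SinceWeek": 508_031,
    "Promo2SinceYear": 508_031,
    "PromoInterval": 508_031,
}

# Share of missing cells over the full observations x variables grid.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size, assuming 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
print("Per-variable counts add up:", sum(missing_by_variable.values()) == missing_cells)
```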
Variable types
| Numeric | 9 |
|---|---|
| DateTime | 1 |
| Categorical | 8 |
Alerts
| Alert | Type |
|---|---|
| StateHoliday is highly imbalanced (88.2%) | Imbalance |
| CompetitionOpenSinceMonth has 323348 (31.8%) missing values | Missing |
| CompetitionOpenSinceYear has 323348 (31.8%) missing values | Missing |
| Promo2SinceWeek has 508031 (49.9%) missing values | Missing |
| Promo2SinceYear has 508031 (49.9%) missing values | Missing |
| PromoInterval has 508031 (49.9%) missing values | Missing |
| Sales has 172871 (17.0%) zeros | Zeros |
| Customers has 172869 (17.0%) zeros | Zeros |
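The 88.2% imbalance figure for StateHoliday can be reproduced from that variable's value counts, which appear later in the report. The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalised by the maximum entropy log2(k) for k classes. This definition is an assumption about the tool's internals, chosen because it matches the reported value, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H(p) / log2(k): 0 for a uniform split, near 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Assumed convention for this sketch: a single class is not scored.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# StateHoliday value counts as listed in the report.
state_holiday_counts = {"0": 986159, "a": 20260, "b": 6690, "c": 4100}
score = imbalance_score(list(state_holiday_counts.values()))
print(f"StateHoliday imbalance: {score:.1%}")
```

Under this definition a perfectly balanced variable scores 0, so a value close to 90% reflects that almost all rows fall in the single class `0`.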
Reproduction
| Analysis started | 2024-01-12 17:55:33.883842 |
|---|---|
| Analysis finished | 2024-01-12 17:55:58.466092 |
| Duration | 24.58 seconds |
| Software version | ydata-profiling v4.6.4 |
| Download configuration | config.json |
Store
Real number (ℝ)
| Distinct | 1115 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 558.42973 |
| Minimum | 1 |
| Maximum | 1115 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.8 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 56 |
| Q1 | 280 |
| median | 558 |
| Q3 | 838 |
| 95-th percentile | 1060 |
| Maximum | 1115 |
| Range | 1114 |
| Interquartile range (IQR) | 558 |
Descriptive statistics
| Standard deviation | 321.90865 |
|---|---|
| Coefficient of variation (CV) | 0.57645329 |
| Kurtosis | -1.2005237 |
| Mean | 558.42973 |
| Median Absolute Deviation (MAD) | 279 |
| Skewness | -0.00095487998 |
| Sum | 5.6803974 × 10⁸ |
| Variance | 103625.18 |
| Monotonicity | Not monotonic |
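Several rows in the quantile and descriptive tables are derived from other rows. As a quick consistency check, assuming the standard definitions (range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared), the Store figures can be recomputed from the reported inputs. Variable names are illustrative.

```python
# Inputs copied from the Store tables.
minimum, q1, q3, maximum = 1, 280, 838, 1115
mean, std = 558.42973, 321.90865

value_range = maximum - minimum   # reported as 1114
iqr = q3 - q1                     # reported as 558
cv = std / mean                   # reported as 0.57645329
variance = std ** 2               # reported as 103625.18

print(value_range, iqr, cv, variance)
```

Small differences in the last digits are expected because the inputs are already rounded in the report.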
| Value | Count | Frequency (%) |
| 1 | 942 | 0.1% |
| 726 | 942 | 0.1% |
| 708 | 942 | 0.1% |
| 709 | 942 | 0.1% |
| 713 | 942 | 0.1% |
| 714 | 942 | 0.1% |
| 715 | 942 | 0.1% |
| 717 | 942 | 0.1% |
| 718 | 942 | 0.1% |
| 720 | 942 | 0.1% |
| Other values (1105) | 1007789 |
| Value | Count | Frequency (%) |
| 1 | 942 | |
| 2 | 942 | |
| 3 | 942 | |
| 4 | 942 | |
| 5 | 942 | |
| 6 | 942 | |
| 7 | 942 | |
| 8 | 942 | |
| 9 | 942 | |
| 10 | 942 |
| Value | Count | Frequency (%) |
| 1115 | 942 | |
| 1114 | 942 | |
| 1113 | 942 | |
| 1112 | 942 | |
| 1111 | 942 | |
| 1110 | 942 | |
| 1109 | 758 | |
| 1108 | 942 | |
| 1107 | 758 | |
| 1106 | 942 |
DayOfWeek
Real number (ℝ)
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.9983406 |
| Minimum | 1 |
| Maximum | 7 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.8 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 4 |
| Q3 | 6 |
| 95-th percentile | 7 |
| Maximum | 7 |
| Range | 6 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 1.997391 |
|---|---|
| Coefficient of variation (CV) | 0.49955499 |
| Kurtosis | -1.2468733 |
| Mean | 3.9983406 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.0015928228 |
| Sum | 4067148 |
| Variance | 3.9895707 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5 | 145845 | |
| 4 | 145845 | |
| 3 | 145665 | |
| 2 | 145664 | |
| 1 | 144730 | |
| 7 | 144730 | |
| 6 | 144730 |
| Value | Count | Frequency (%) |
| 1 | 144730 | |
| 2 | 145664 | |
| 3 | 145665 | |
| 4 | 145845 | |
| 5 | 145845 | |
| 6 | 144730 | |
| 7 | 144730 |
| Value | Count | Frequency (%) |
| 7 | 144730 | |
| 6 | 144730 | |
| 5 | 145845 | |
| 4 | 145845 | |
| 3 | 145665 | |
| 2 | 145664 | |
| 1 | 144730 |
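Because DayOfWeek takes only seven values, its count, sum, and mean can be recovered from the frequency table alone. A minimal sketch using the counts listed above; the variance line assumes the sample (n − 1) denominator, which is an assumption about how the report computes it.

```python
# Value counts copied from the DayOfWeek frequency table.
day_counts = {
    1: 144730,
    2: 145664,
    3: 145665,
    4: 145845,
    5: 145845,
    6: 144730,
    7: 144730,
}

n = sum(day_counts.values())                             # number of observations
total = sum(value * count for value, count in day_counts.items())
mean = total / n

# Weighted sum of squared deviations, then the (assumed) sample variance.
squared_dev = sum(count * (value - mean) ** 2 for value, count in day_counts.items())
variance = squared_dev / (n - 1)

print(n, total, mean, variance)
```

The recomputed count and sum match the reported 1017209 observations and the Sum row of 4067148.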
Date
Date
| Distinct | 942 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 MiB |
| Minimum | 2013-01-01 00:00:00 |
| Maximum | 2015-07-31 00:00:00 |
Sales
Real number (ℝ)
ZEROS 
| Distinct | 21734 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5773.819 |
| Minimum | 0 |
| Maximum | 41551 |
| Zeros | 172871 |
| Zeros (%) | 17.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 3727 |
| median | 5744 |
| Q3 | 7856 |
| 95-th percentile | 12137 |
| Maximum | 41551 |
| Range | 41551 |
| Interquartile range (IQR) | 4129 |
Descriptive statistics
| Standard deviation | 3849.9262 |
|---|---|
| Coefficient of variation (CV) | 0.66679025 |
| Kurtosis | 1.7783747 |
| Mean | 5773.819 |
| Median Absolute Deviation (MAD) | 2067 |
| Skewness | 0.64145962 |
| Sum | 5.8731806 × 10⁹ |
| Variance | 14821932 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 172871 | 17.0% |
| 5674 | 215 | < 0.1% |
| 5558 | 197 | < 0.1% |
| 5483 | 196 | < 0.1% |
| 6214 | 195 | < 0.1% |
| 6049 | 195 | < 0.1% |
| 5723 | 194 | < 0.1% |
| 5449 | 192 | < 0.1% |
| 5140 | 191 | < 0.1% |
| 5489 | 191 | < 0.1% |
| Other values (21724) | 842572 |
| Value | Count | Frequency (%) |
| 0 | 172871 | |
| 46 | 1 | < 0.1% |
| 124 | 1 | < 0.1% |
| 133 | 1 | < 0.1% |
| 286 | 1 | < 0.1% |
| 297 | 1 | < 0.1% |
| 316 | 1 | < 0.1% |
| 416 | 1 | < 0.1% |
| 506 | 1 | < 0.1% |
| 520 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 41551 | 1 | |
| 38722 | 1 | |
| 38484 | 1 | |
| 38367 | 1 | |
| 38037 | 1 | |
| 38025 | 1 | |
| 37646 | 1 | |
| 37403 | 1 | |
| 37376 | 1 | |
| 37122 | 1 |
Customers
Real number (ℝ)
ZEROS 
| Distinct | 4086 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 633.14595 |
| Minimum | 0 |
| Maximum | 7388 |
| Zeros | 172869 |
| Zeros (%) | 17.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 405 |
| median | 609 |
| Q3 | 837 |
| 95-th percentile | 1362 |
| Maximum | 7388 |
| Range | 7388 |
| Interquartile range (IQR) | 432 |
Descriptive statistics
| Standard deviation | 464.41173 |
|---|---|
| Coefficient of variation (CV) | 0.73349871 |
| Kurtosis | 7.0917727 |
| Mean | 633.14595 |
| Median Absolute Deviation (MAD) | 216 |
| Skewness | 1.5986503 |
| Sum | 6.4404176 × 10⁸ |
| Variance | 215678.26 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 172869 | 17.0% |
| 560 | 2414 | 0.2% |
| 576 | 2363 | 0.2% |
| 603 | 2337 | 0.2% |
| 571 | 2330 | 0.2% |
| 555 | 2328 | 0.2% |
| 566 | 2327 | 0.2% |
| 517 | 2326 | 0.2% |
| 539 | 2309 | 0.2% |
| 651 | 2299 | 0.2% |
| Other values (4076) | 823307 |
| Value | Count | Frequency (%) |
| 0 | 172869 | |
| 3 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 13 | 1 | < 0.1% |
| 18 | 1 | < 0.1% |
| 36 | 1 | < 0.1% |
| 40 | 1 | < 0.1% |
| 44 | 1 | < 0.1% |
| 50 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 7388 | 1 | |
| 5494 | 1 | |
| 5458 | 1 | |
| 5387 | 1 | |
| 5297 | 1 | |
| 5192 | 1 | |
| 5152 | 1 | |
| 5145 | 1 | |
| 5132 | 1 | |
| 5112 | 1 |
Open
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1017209 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 844392 | 83.0% |
| 0 | 172817 | 17.0% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1 | 844392 | |
| 0 | 172817 | 17.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 844392 | |
| 0 | 172817 | 17.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1017209 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 844392 | |
| 0 | 172817 | 17.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1017209 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 844392 | |
| 0 | 172817 | 17.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1017209 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 844392 | |
| 0 | 172817 | 17.0% |
Promo
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1017209 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 629129 | 61.8% |
| 1 | 388080 | 38.2% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0 | 629129 | |
| 1 | 388080 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 629129 | |
| 1 | 388080 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1017209 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 629129 | |
| 1 | 388080 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1017209 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 629129 | |
| 1 | 388080 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1017209 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 629129 | |
| 1 | 388080 |
StateHoliday
Categorical
IMBALANCE 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 MiB |
| 0 | 986159 |
| a | 20260 |
| b | 6690 |
| c | 4100 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1017209 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 986159 | 96.9% |
| a | 20260 | 2.0% |
| b | 6690 | 0.7% |
| c | 4100 | 0.4% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0 | 986159 | |
| a | 20260 | 2.0% |
| b | 6690 | 0.7% |
| c | 4100 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 986159 | |
| a | 20260 | 2.0% |
| b | 6690 | 0.7% |
| c | 4100 | 0.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 986159 | |
| Lowercase Letter | 31050 | 3.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 20260 | |
| b | 6690 | 21.5% |
| c | 4100 | 13.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 986159 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 986159 | |
| Latin | 31050 | 3.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 20260 | |
| b | 6690 | 21.5% |
| c | 4100 | 13.2% |
Common
| Value | Count | Frequency (%) |
| 0 | 986159 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1017209 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 986159 | |
| a | 20260 | 2.0% |
| b | 6690 | 0.7% |
| c | 4100 | 0.4% |
SchoolHoliday
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1017209 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 835488 | 82.1% |
| 1 | 181721 | 17.9% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0 | 835488 | |
| 1 | 181721 | 17.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 835488 | |
| 1 | 181721 | 17.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1017209 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 835488 | |
| 1 | 181721 | 17.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1017209 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 835488 | |
| 1 | 181721 | 17.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1017209 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 835488 | |
| 1 | 181721 | 17.9% |
StoreType
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 MiB |
| a | 551627 |
| d | 312912 |
| c | 136840 |
| b | 15830 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1017209 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | c |
|---|---|
| 2nd row | a |
| 3rd row | a |
| 4th row | c |
| 5th row | a |
Common Values
| Value | Count | Frequency (%) |
| a | 551627 | 54.2% |
| d | 312912 | 30.8% |
| c | 136840 | 13.5% |
| b | 15830 | 1.6% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| a | 551627 | |
| d | 312912 | |
| c | 136840 | 13.5% |
| b | 15830 | 1.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 551627 | |
| d | 312912 | |
| c | 136840 | 13.5% |
| b | 15830 | 1.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1017209 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 551627 | |
| d | 312912 | |
| c | 136840 | 13.5% |
| b | 15830 | 1.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1017209 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 551627 | |
| d | 312912 | |
| c | 136840 | 13.5% |
| b | 15830 | 1.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1017209 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 551627 | |
| d | 312912 | |
| c | 136840 | 13.5% |
| b | 15830 | 1.6% |
Assortment
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 MiB |
| a | 537445 |
| c | 471470 |
| b | 8294 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1017209 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | a |
|---|---|
| 2nd row | a |
| 3rd row | a |
| 4th row | c |
| 5th row | a |
Common Values
| Value | Count | Frequency (%) |
| a | 537445 | 52.8% |
| c | 471470 | 46.3% |
| b | 8294 | 0.8% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| a | 537445 | |
| c | 471470 | |
| b | 8294 | 0.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 537445 | |
| c | 471470 | |
| b | 8294 | 0.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1017209 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 537445 | |
| c | 471470 | |
| b | 8294 | 0.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1017209 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 537445 | |
| c | 471470 | |
| b | 8294 | 0.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1017209 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 537445 | |
| c | 471470 | |
| b | 8294 | 0.8% |
CompetitionDistance
Real number (ℝ)
| Distinct | 654 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 2642 |
| Missing (%) | 0.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5430.0857 |
| Minimum | 20 |
| Maximum | 75860 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.8 MiB |
Quantile statistics
| Minimum | 20 |
|---|---|
| 5-th percentile | 130 |
| Q1 | 710 |
| median | 2330 |
| Q3 | 6890 |
| 95-th percentile | 20390 |
| Maximum | 75860 |
| Range | 75840 |
| Interquartile range (IQR) | 6180 |
Descriptive statistics
| Standard deviation | 7715.3237 |
|---|---|
| Coefficient of variation (CV) | 1.4208475 |
| Kurtosis | 13.000022 |
| Mean | 5430.0857 |
| Median Absolute Deviation (MAD) | 1980 |
| Skewness | 2.928534 |
| Sum | 5.5091857 × 10⁹ |
| Variance | 59526220 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 250 | 11120 | 1.1% |
| 50 | 7536 | 0.7% |
| 350 | 7536 | 0.7% |
| 1200 | 7374 | 0.7% |
| 190 | 7352 | 0.7% |
| 180 | 6594 | 0.6% |
| 90 | 6594 | 0.6% |
| 330 | 6410 | 0.6% |
| 150 | 6226 | 0.6% |
| 2640 | 5652 | 0.6% |
| Other values (644) | 942173 |
| Value | Count | Frequency (%) |
| 20 | 942 | 0.1% |
| 30 | 3767 | |
| 40 | 4710 | |
| 50 | 7536 | |
| 60 | 2826 | 0.3% |
| 70 | 4526 | |
| 80 | 2826 | 0.3% |
| 90 | 6594 | |
| 100 | 4710 | |
| 110 | 5468 |
| Value | Count | Frequency (%) |
| 75860 | 942 | |
| 58260 | 942 | |
| 48330 | 942 | |
| 46590 | 942 | |
| 45740 | 942 | |
| 44320 | 942 | |
| 40860 | 942 | |
| 40540 | 942 | |
| 38710 | 942 | |
| 38630 | 942 |
CompetitionOpenSinceMonth
Real number (ℝ)
MISSING 
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 323348 |
| Missing (%) | 31.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.222866 |
| Minimum | 1 |
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.8 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 4 |
| median | 8 |
| Q3 | 10 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.2118321 |
|---|---|
| Coefficient of variation (CV) | 0.44467558 |
| Kurtosis | -1.248357 |
| Mean | 7.222866 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.16986163 |
| Sum | 5011665 |
| Variance | 10.315866 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 9 | 114254 | 11.2% |
| 4 | 87076 | 8.6% |
| 11 | 84455 | 8.3% |
| 3 | 63548 | 6.2% |
| 7 | 59434 | 5.8% |
| 12 | 57896 | 5.7% |
| 10 | 55622 | 5.5% |
| 6 | 45444 | 4.5% |
| 5 | 39608 | 3.9% |
| 2 | 37886 | 3.7% |
| Other values (2) | 48638 | 4.8% |
| (Missing) | 323348 |
| Value | Count | Frequency (%) |
| 1 | 12452 | 1.2% |
| 2 | 37886 | 3.7% |
| 3 | 63548 | |
| 4 | 87076 | |
| 5 | 39608 | 3.9% |
| 6 | 45444 | 4.5% |
| 7 | 59434 | |
| 8 | 36186 | 3.6% |
| 9 | 114254 | |
| 10 | 55622 |
| Value | Count | Frequency (%) |
| 12 | 57896 | |
| 11 | 84455 | |
| 10 | 55622 | |
| 9 | 114254 | |
| 8 | 36186 | 3.6% |
| 7 | 59434 | |
| 6 | 45444 | 4.5% |
| 5 | 39608 | 3.9% |
| 4 | 87076 | |
| 3 | 63548 |
CompetitionOpenSinceYear
Real number (ℝ)
MISSING 
| Distinct | 23 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 323348 |
| Missing (%) | 31.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2008.6902 |
| Minimum | 1900 |
| Maximum | 2015 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.8 MiB |
Quantile statistics
| Minimum | 1900 |
|---|---|
| 5-th percentile | 2001 |
| Q1 | 2006 |
| median | 2010 |
| Q3 | 2013 |
| 95-th percentile | 2015 |
| Maximum | 2015 |
| Range | 115 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 5.9926444 |
|---|---|
| Coefficient of variation (CV) | 0.0029833592 |
| Kurtosis | 121.93467 |
| Mean | 2008.6902 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -7.5395149 |
| Sum | 1.3937518 × 10⁹ |
| Variance | 35.911787 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2013 | 75426 | 7.4% |
| 2012 | 74299 | 7.3% |
| 2014 | 63732 | 6.3% |
| 2005 | 56564 | 5.6% |
| 2010 | 51258 | 5.0% |
| 2011 | 49396 | 4.9% |
| 2009 | 49396 | 4.9% |
| 2008 | 48476 | 4.8% |
| 2007 | 43744 | 4.3% |
| 2006 | 42802 | 4.2% |
| Other values (13) | 138768 | |
| (Missing) | 323348 |
| Value | Count | Frequency (%) |
| 1900 | 758 | 0.1% |
| 1961 | 942 | 0.1% |
| 1990 | 4710 | 0.5% |
| 1994 | 1884 | 0.2% |
| 1995 | 1700 | 0.2% |
| 1998 | 942 | 0.1% |
| 1999 | 7352 | 0.7% |
| 2000 | 9236 | 0.9% |
| 2001 | 14704 | |
| 2002 | 24882 |
| Value | Count | Frequency (%) |
| 2015 | 35060 | |
| 2014 | 63732 | |
| 2013 | 75426 | |
| 2012 | 74299 | |
| 2011 | 49396 | |
| 2010 | 51258 | |
| 2009 | 49396 | |
| 2008 | 48476 | |
| 2007 | 43744 | |
| 2006 | 42802 |
Promo2
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1017209 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 509178 | 50.1% |
| 0 | 508031 | 49.9% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1 | 509178 | |
| 0 | 508031 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 509178 | |
| 0 | 508031 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1017209 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 509178 | |
| 0 | 508031 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1017209 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 509178 | |
| 0 | 508031 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1017209 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 509178 | |
| 0 | 508031 |
Promo2SinceWeek
Real number (ℝ)
MISSING 
| Distinct | 24 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 508031 |
| Missing (%) | 49.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 23.269093 |
| Minimum | 1 |
| Maximum | 50 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.8 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 13 |
| median | 22 |
| Q3 | 37 |
| 95-th percentile | 45 |
| Maximum | 50 |
| Range | 49 |
| Interquartile range (IQR) | 24 |
Descriptive statistics
| Standard deviation | 14.095973 |
|---|---|
| Coefficient of variation (CV) | 0.60578093 |
| Kurtosis | -1.3699286 |
| Mean | 23.269093 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | 0.10452752 |
| Sum | 11848110 |
| Variance | 198.69644 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 14 | 72990 | 7.2% |
| 40 | 62598 | 6.2% |
| 31 | 39976 | 3.9% |
| 10 | 38828 | 3.8% |
| 5 | 35818 | 3.5% |
| 37 | 32786 | 3.2% |
| 1 | 32418 | 3.2% |
| 13 | 29820 | 2.9% |
| 45 | 29268 | 2.9% |
| 22 | 28694 | 2.8% |
| Other values (14) | 105982 | 10.4% |
| (Missing) | 508031 |
| Value | Count | Frequency (%) |
| 1 | 32418 | |
| 5 | 35818 | |
| 6 | 942 | 0.1% |
| 9 | 12452 | 1.2% |
| 10 | 38828 | |
| 13 | 29820 | |
| 14 | 72990 | |
| 18 | 27318 | 2.7% |
| 22 | 28694 | 2.8% |
| 23 | 4342 | 0.4% |
| Value | Count | Frequency (%) |
| 50 | 942 | 0.1% |
| 49 | 758 | 0.1% |
| 48 | 8294 | 0.8% |
| 45 | 29268 | |
| 44 | 2642 | 0.3% |
| 40 | 62598 | |
| 39 | 4732 | 0.5% |
| 37 | 32786 | |
| 36 | 9236 | 0.9% |
| 35 | 22814 | 2.2% |
Promo2SinceYear
Real number (ℝ)
MISSING 
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 508031 |
| Missing (%) | 49.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2011.7528 |
| Minimum | 2009 |
| Maximum | 2015 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.8 MiB |
Quantile statistics
| Minimum | 2009 |
|---|---|
| 5-th percentile | 2009 |
| Q1 | 2011 |
| median | 2012 |
| Q3 | 2013 |
| 95-th percentile | 2014 |
| Maximum | 2015 |
| Range | 6 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.6628704 |
|---|---|
| Coefficient of variation (CV) | 0.00082657792 |
| Kurtosis | -1.0406623 |
| Mean | 2011.7528 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.12005992 |
| Sum | 1.0243403 × 10⁹ |
| Variance | 2.7651381 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2011 | 115056 | 11.3% |
| 2013 | 110464 | 10.9% |
| 2014 | 79922 | 7.9% |
| 2012 | 73174 | 7.2% |
| 2009 | 65270 | 6.4% |
| 2010 | 56240 | 5.5% |
| 2015 | 9052 | 0.9% |
| (Missing) | 508031 |
| Value | Count | Frequency (%) |
| 2009 | 65270 | |
| 2010 | 56240 | |
| 2011 | 115056 | |
| 2012 | 73174 | |
| 2013 | 110464 | |
| 2014 | 79922 | |
| 2015 | 9052 | 0.9% |
| Value | Count | Frequency (%) |
| 2015 | 9052 | 0.9% |
| 2014 | 79922 | |
| 2013 | 110464 | |
| 2012 | 73174 | |
| 2011 | 115056 | |
| 2010 | 56240 | |
| 2009 | 65270 |
PromoInterval
Categorical
MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 508031 |
| Missing (%) | 49.9% |
| Memory size | 7.8 MiB |
Length
| Max length | 16 |
|---|---|
| Median length | 15 |
| Mean length | 15.191407 |
| Min length | 15 |
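These length statistics follow directly from the three distinct values and their counts (listed under Common Values): two of the strings are 15 characters long, while `Mar,Jun,Sept,Dec` is 16 characters because it uses the four-letter abbreviation `Sept`. A short check, with illustrative variable names, which also reproduces the total character count of 7735130 reported under Characters and Unicode:

```python
# Non-missing PromoInterval values and their counts, copied from the report.
interval_counts = {
    "Jan,Apr,Jul,Oct": 293122,
    "Feb,May,Aug,Nov": 118596,
    "Mar,Jun,Sept,Dec": 97460,
}

n_values = sum(interval_counts.values())  # rows with a non-missing value
total_chars = sum(len(value) * count for value, count in interval_counts.items())
mean_length = total_chars / n_values

lengths = sorted({len(value) for value in interval_counts})
print(lengths, n_values, total_chars, mean_length)
```

The median length is 15 because more than half of the non-missing rows hold one of the two 15-character values.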
Characters and Unicode
| Total characters | 7735130 |
|---|---|
| Distinct characters | 23 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Jan,Apr,Jul,Oct |
|---|---|
| 2nd row | Jan,Apr,Jul,Oct |
| 3rd row | Jan,Apr,Jul,Oct |
| 4th row | Jan,Apr,Jul,Oct |
| 5th row | Feb,May,Aug,Nov |
Common Values
| Value | Count | Frequency (%) |
| Jan,Apr,Jul,Oct | 293122 | 28.8% |
| Feb,May,Aug,Nov | 118596 | 11.7% |
| Mar,Jun,Sept,Dec | 97460 | 9.6% |
| (Missing) | 508031 | 49.9% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| jan,apr,jul,oct | 293122 | |
| feb,may,aug,nov | 118596 | |
| mar,jun,sept,dec | 97460 | 19.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| , | 1527534 | |
| J | 683704 | 8.8% |
| u | 509178 | 6.6% |
| a | 509178 | 6.6% |
| A | 411718 | 5.3% |
| c | 390582 | 5.0% |
| t | 390582 | 5.0% |
| r | 390582 | 5.0% |
| p | 390582 | 5.0% |
| n | 390582 | 5.0% |
| Other values (13) | 2140908 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4170884 | |
| Uppercase Letter | 2036712 | |
| Other Punctuation | 1527534 | 19.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| u | 509178 | |
| a | 509178 | |
| c | 390582 | |
| t | 390582 | |
| r | 390582 | |
| p | 390582 | |
| n | 390582 | |
| e | 313516 | |
| l | 293122 | |
| b | 118596 | 2.8% |
| Other values (4) | 474384 |
Uppercase Letter
| Value | Count | Frequency (%) |
| J | 683704 | |
| A | 411718 | |
| O | 293122 | |
| M | 216056 | 10.6% |
| F | 118596 | 5.8% |
| N | 118596 | 5.8% |
| S | 97460 | 4.8% |
| D | 97460 | 4.8% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 1527534 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6207596 | |
| Common | 1527534 | 19.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| J | 683704 | 11.0% |
| u | 509178 | 8.2% |
| a | 509178 | 8.2% |
| A | 411718 | 6.6% |
| c | 390582 | 6.3% |
| t | 390582 | 6.3% |
| r | 390582 | 6.3% |
| p | 390582 | 6.3% |
| n | 390582 | 6.3% |
| e | 313516 | 5.1% |
| Other values (12) | 1827392 |
Common
| Value | Count | Frequency (%) |
| , | 1527534 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 7735130 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| , | 1527534 | |
| J | 683704 | 8.8% |
| u | 509178 | 6.6% |
| a | 509178 | 6.6% |
| A | 411718 | 5.3% |
| c | 390582 | 5.0% |
| t | 390582 | 5.0% |
| r | 390582 | 5.0% |
| p | 390582 | 5.0% |
| n | 390582 | 5.0% |
| Other values (13) | 2140908 |
First rows
| | Store | DayOfWeek | Date | Sales | Customers | Open | Promo | StateHoliday | SchoolHoliday | StoreType | Assortment | CompetitionDistance | CompetitionOpenSinceMonth | CompetitionOpenSinceYear | Promo2 | Promo2SinceWeek | Promo2SinceYear | PromoInterval |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 5 | 2015-07-31 | 5263 | 555 | 1 | 1 | 0 | 1 | c | a | 1270.0 | 9.0 | 2008.0 | 0 | NaN | NaN | NaN |
| 1 | 2 | 5 | 2015-07-31 | 6064 | 625 | 1 | 1 | 0 | 1 | a | a | 570.0 | 11.0 | 2007.0 | 1 | 13.0 | 2010.0 | Jan,Apr,Jul,Oct |
| 2 | 3 | 5 | 2015-07-31 | 8314 | 821 | 1 | 1 | 0 | 1 | a | a | 14130.0 | 12.0 | 2006.0 | 1 | 14.0 | 2011.0 | Jan,Apr,Jul,Oct |
| 3 | 4 | 5 | 2015-07-31 | 13995 | 1498 | 1 | 1 | 0 | 1 | c | c | 620.0 | 9.0 | 2009.0 | 0 | NaN | NaN | NaN |
| 4 | 5 | 5 | 2015-07-31 | 4822 | 559 | 1 | 1 | 0 | 1 | a | a | 29910.0 | 4.0 | 2015.0 | 0 | NaN | NaN | NaN |
| 5 | 6 | 5 | 2015-07-31 | 5651 | 589 | 1 | 1 | 0 | 1 | a | a | 310.0 | 12.0 | 2013.0 | 0 | NaN | NaN | NaN |
| 6 | 7 | 5 | 2015-07-31 | 15344 | 1414 | 1 | 1 | 0 | 1 | a | c | 24000.0 | 4.0 | 2013.0 | 0 | NaN | NaN | NaN |
| 7 | 8 | 5 | 2015-07-31 | 8492 | 833 | 1 | 1 | 0 | 1 | a | a | 7520.0 | 10.0 | 2014.0 | 0 | NaN | NaN | NaN |
| 8 | 9 | 5 | 2015-07-31 | 8565 | 687 | 1 | 1 | 0 | 1 | a | c | 2030.0 | 8.0 | 2000.0 | 0 | NaN | NaN | NaN |
| 9 | 10 | 5 | 2015-07-31 | 7185 | 681 | 1 | 1 | 0 | 1 | a | a | 3160.0 | 9.0 | 2009.0 | 0 | NaN | NaN | NaN |
Last rows
| | Store | DayOfWeek | Date | Sales | Customers | Open | Promo | StateHoliday | SchoolHoliday | StoreType | Assortment | CompetitionDistance | CompetitionOpenSinceMonth | CompetitionOpenSinceYear | Promo2 | Promo2SinceWeek | Promo2SinceYear | PromoInterval |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1017199 | 1106 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | a | c | 5330.0 | 9.0 | 2011.0 | 1 | 31.0 | 2013.0 | Jan,Apr,Jul,Oct |
| 1017200 | 1107 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | a | a | 1400.0 | 6.0 | 2012.0 | 1 | 13.0 | 2010.0 | Jan,Apr,Jul,Oct |
| 1017201 | 1108 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | a | a | 540.0 | 4.0 | 2004.0 | 0 | NaN | NaN | NaN |
| 1017202 | 1109 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | c | a | 3490.0 | 4.0 | 2011.0 | 1 | 22.0 | 2012.0 | Jan,Apr,Jul,Oct |
| 1017203 | 1110 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | c | c | 900.0 | 9.0 | 2010.0 | 0 | NaN | NaN | NaN |
| 1017204 | 1111 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | a | a | 1900.0 | 6.0 | 2014.0 | 1 | 31.0 | 2013.0 | Jan,Apr,Jul,Oct |
| 1017205 | 1112 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | c | c | 1880.0 | 4.0 | 2006.0 | 0 | NaN | NaN | NaN |
| 1017206 | 1113 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | a | c | 9260.0 | NaN | NaN | 0 | NaN | NaN | NaN |
| 1017207 | 1114 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | a | c | 870.0 | NaN | NaN | 0 | NaN | NaN | NaN |
| 1017208 | 1115 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | d | c | 5350.0 | NaN | NaN | 1 | 22.0 | 2012.0 | Mar,Jun,Sept,Dec |